Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 1551 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.2 MiB |
| Average record size in memory | 785.3 B |
Variable types
| CAT | 10 |
|---|---|
| NUM | 9 |
| BOOL | 1 |
Reproduction
| Analysis started | 2020-11-06 23:19:41.253851 |
|---|---|
| Analysis finished | 2020-11-06 23:20:09.459704 |
| Version | pandas-profiling v2.6.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Warning | Type |
|---|---|
| can_id has a high cardinality: 1551 distinct values | High cardinality |
| can_nam has a high cardinality: 1545 distinct values | High cardinality |
| can_off_sta has a high cardinality: 57 distinct values | High cardinality |
| can_cit has a high cardinality: 939 distinct values | High cardinality |
| can_sta has a high cardinality: 56 distinct values | High cardinality |
| cov_sta_dat has a high cardinality: 180 distinct values | High cardinality |
| cov_end_dat has a high cardinality: 126 distinct values | High cardinality |
| net_ope_exp is highly correlated with ind_con and 5 other fields | High correlation |
| ind_con is highly correlated with net_ope_exp and 5 other fields | High correlation |
| tot_con is highly correlated with ind_con and 5 other fields | High correlation |
| tot_dis is highly correlated with ind_con and 4 other fields | High correlation |
| net_con is highly correlated with ind_con and 2 other fields | High correlation |
| ope_exp is highly correlated with ind_con and 4 other fields | High correlation |
| tot_rec is highly correlated with ind_con and 4 other fields | High correlation |
| can_sta is highly correlated with can_off_sta | High correlation |
| can_off_sta is highly correlated with can_sta | High correlation |
| ind_con is highly skewed (γ1 = 23.73771695) | Skewed |
| net_ope_exp is highly skewed (γ1 = 24.63124139) | Skewed |
| tot_con is highly skewed (γ1 = 22.70877387) | Skewed |
| tot_dis is highly skewed (γ1 = 22.03489511) | Skewed |
| net_con is highly skewed (γ1 = 26.97269546) | Skewed |
| ope_exp is highly skewed (γ1 = 22.40544929) | Skewed |
| tot_rec is highly skewed (γ1 = 22.24268841) | Skewed |
| can_off_dis has 274 (17.7%) zeros | Zeros |
can_id
Categorical
| Distinct count | 1551 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| S6HI00271 | 1 | 0.1% | |
| S0HI00126 | 1 | 0.1% | |
| P40003576 | 1 | 0.1% | |
| S0IN00095 | 1 | 0.1% | |
| H6RI01112 | 1 | 0.1% | |
| H6MI01226 | 1 | 0.1% | |
| H2MO08067 | 1 | 0.1% | |
| H8CA41139 | 1 | 0.1% | |
| S6CT05108 | 1 | 0.1% | |
| H6FL18147 | 1 | 0.1% | |
| Other values (1541) | 1541 | 99.4% |
Length
| Max length | 9 |
|---|---|
| Mean length | 9 |
| Min length | 9 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 24 | 70.6% | |
| Decimal_Number | 10 | 29.4% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 70.6% | |
| Common | 10 | 29.4% |
| Value | Count | Frequency (%) | |
| ASCII | 34 | 100.0% |
can_nam
Categorical
| Distinct count | 1545 |
|---|---|
| Unique (%) | 99.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| FLYNN, MICHAEL | 2 | 0.1% | |
| SPOTORNO, FRANK | 2 | 0.1% | |
| MARSHALL, ROBERT | 2 | 0.1% | |
| PAUL, RAND | 2 | 0.1% | |
| BURK, JOHN GUNTHER JR | 2 | 0.1% | |
| RUBIO, MARCO | 2 | 0.1% | |
| SCALISE, STEVE MR. | 1 | 0.1% | |
| KIRK, MARK STEVEN | 1 | 0.1% | |
| JOHNSON, BILL | 1 | 0.1% | |
| MCKINLEY, DAVID B. MR. | 1 | 0.1% | |
| Other values (1535) | 1535 | 99.0% |
Length
| Max length | 36 |
|---|---|
| Mean length | 17.31914894 |
| Min length | 8 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 26 | 74.3% | |
| Other_Punctuation | 5 | 14.3% | |
| Open_Punctuation | 1 | 2.9% | |
| Space_Separator | 1 | 2.9% | |
| Dash_Punctuation | 1 | 2.9% | |
| Close_Punctuation | 1 | 2.9% |
| Value | Count | Frequency (%) | |
| Latin | 26 | 74.3% | |
| Common | 9 | 25.7% |
| Value | Count | Frequency (%) | |
| ASCII | 35 | 100.0% |
can_off
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| H | 1318 | 85.0% | |
| S | 188 | 12.1% | |
| P | 45 | 2.9% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 3 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 3 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 3 | 100.0% |
can_off_sta
Categorical
| Distinct count | 57 |
|---|---|
| Unique (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| CA | 164 | 10.6% | |
| FL | 144 | 9.3% | |
| TX | 100 | 6.4% | |
| NY | 85 | 5.5% | |
| NC | 63 | 4.1% | |
| IL | 62 | 4.0% | |
| PA | 56 | 3.6% | |
| MD | 50 | 3.2% | |
| US | 45 | 2.9% | |
| OH | 43 | 2.8% | |
| Other values (47) | 739 | 47.6% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 24 | 100.0% |
can_off_dis
Real number (ℝ≥0)
| Distinct count | 54 |
|---|---|
| Unique (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.414571244 |
|---|---|
| Minimum | 0 |
| Maximum | 53 |
| Zeros | 274 |
| Zeros (%) | 17.7% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 5 |
| Q3 | 11 |
| 95-th percentile | 31 |
| Maximum | 53 |
| Range | 53 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 10.24022526 |
|---|---|
| Coefficient of variation (CV) | 1.216963403 |
| Kurtosis | 4.338587712 |
| Mean | 8.414571244 |
| Median Absolute Deviation (MAD) | 7.334474836 |
| Skewness | 2.014306308 |
| Sum | 13051 |
| Variance | 104.8622133 |
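The descriptive statistics above are related by simple arithmetic, which gives a quick way to sanity-check the report. The sketch below recomputes the mean, variance, coefficient of variation, IQR and range from the other reported figures; the variable names are illustrative only, and the constants are copied from the tables above.

```python
# Figures copied from the report tables for this variable.
n_obs = 1551          # Number of observations
total = 13051         # Sum
std_dev = 10.24022526 # Standard deviation
q1, q3 = 1, 11        # Q1 and Q3
minimum, maximum = 0, 53

mean = total / n_obs               # Mean = Sum / number of observations
variance = std_dev ** 2            # Variance = standard deviation squared
cv = std_dev / mean                # Coefficient of variation = std / mean
iqr = q3 - q1                      # Interquartile range = Q3 - Q1
value_range = maximum - minimum    # Range = Maximum - Minimum

print(f"mean={mean:.9f} variance={variance:.7f} cv={cv:.9f}")
print(f"iqr={iqr} range={value_range}")
```

The recomputed values agree with the reported Mean, Variance, CV, IQR and Range up to rounding.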
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 3.5 8.5 13.5 19.5 27.5 53. ], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 0 | 274 | 17.7% | |
| 1 | 140 | 9.0% | |
| 2 | 112 | 7.2% | |
| 3 | 112 | 7.2% | |
| 4 | 94 | 6.1% | |
| 8 | 90 | 5.8% | |
| 5 | 85 | 5.5% | |
| 6 | 76 | 4.9% | |
| 7 | 66 | 4.3% | |
| 9 | 53 | 3.4% | |
| Other values (44) | 449 | 28.9% |
| Value | Count | Frequency (%) | |
| 0 | 274 | 17.7% | |
| 1 | 140 | 9.0% | |
| 2 | 112 | 7.2% | |
| 3 | 112 | 7.2% | |
| 4 | 94 | 6.1% |
| Value | Count | Frequency (%) | |
| 53 | 3 | 0.2% | |
| 52 | 5 | 0.3% | |
| 51 | 2 | 0.1% | |
| 50 | 2 | 0.1% | |
| 49 | 2 | 0.1% |
can_par_aff
Categorical
| Distinct count | 18 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| REP | 789 | 50.9% | |
| DEM | 648 | 41.8% | |
| IND | 38 | 2.5% | |
| LIB | 24 | 1.5% | |
| GRE | 11 | 0.7% | |
| OTH | 9 | 0.6% | |
| UNK | 6 | 0.4% | |
| NNE | 5 | 0.3% | |
| NPA | 5 | 0.3% | |
| DFL | 4 | 0.3% | |
| Other values (8) | 12 | 0.8% |
Length
| Max length | 3 |
|---|---|
| Mean length | 2.996776273 |
| Min length | 1 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 19 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 19 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 19 | 100.0% |
can_inc_cha_ope_sea
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| CHALLENGER | 735 | 47.4% | |
| INCUMBENT | 419 | 27.0% | |
| OPEN | 397 | 25.6% |
Length
| Max length | 10 |
|---|---|
| Mean length | 8.194068343 |
| Min length | 4 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 15 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 15 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 15 | 100.0% |
can_cit
Categorical
| Distinct count | 939 |
|---|---|
| Unique (%) | 60.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| LAS VEGAS | 19 | 1.2% | |
| NEW YORK | 16 | 1.0% | |
| LOS ANGELES | 14 | 0.9% | |
| CHICAGO | 14 | 0.9% | |
| MIAMI | 13 | 0.8% | |
| HOUSTON | 12 | 0.8% | |
| INDIANAPOLIS | 11 | 0.7% | |
| WASHINGTON | 11 | 0.7% | |
| ORLANDO | 11 | 0.7% | |
| JACKSONVILLE | 10 | 0.6% | |
| Other values (929) | 1420 | 91.6% |
Length
| Max length | 20 |
|---|---|
| Mean length | 8.916827853 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 26 | 86.7% | |
| Other_Punctuation | 2 | 6.7% | |
| Space_Separator | 1 | 3.3% | |
| Dash_Punctuation | 1 | 3.3% |
| Value | Count | Frequency (%) | |
| Latin | 26 | 86.7% | |
| Common | 4 | 13.3% |
| Value | Count | Frequency (%) | |
| ASCII | 30 | 100.0% |
can_sta
Categorical
| Distinct count | 56 |
|---|---|
| Unique (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| CA | 167 | 10.8% | |
| FL | 149 | 9.6% | |
| TX | 100 | 6.4% | |
| NY | 85 | 5.5% | |
| IL | 64 | 4.1% | |
| NC | 63 | 4.1% | |
| PA | 57 | 3.7% | |
| MD | 49 | 3.2% | |
| OH | 44 | 2.8% | |
| VA | 42 | 2.7% | |
| Other values (46) | 731 | 47.1% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 24 | 100.0% |
can_zip
Real number (ℝ≥0)
| Distinct count | 1419 |
|---|---|
| Unique (%) | 91.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 54554585.17 |
|---|---|
| Minimum | 603 |
| Maximum | 989449767 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 603 |
|---|---|
| 5-th percentile | 7751 |
| Q1 | 28605.5 |
| median | 53216 |
| Q3 | 89139 |
| 95-th percentile | 535461697 |
| Maximum | 989449767 |
| Range | 989449164 |
| Interquartile range (IQR) | 60533.5 |
Descriptive statistics
| Standard deviation | 183220823.6 |
|---|---|
| Coefficient of variation (CV) | 3.358486241 |
| Kurtosis | 11.62751188 |
| Mean | 54554585.17 |
| Median Absolute Deviation (MAD) | 98165446.25 |
| Skewness | 3.527330958 |
| Sum | 8.461416159e+10 |
| Variance | 3.356987022e+16 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[6.03000000e+02 1.00095000e+04 1.00385000e+04 1.17500000e+04 1.90175000e+04 ... 9.80080000e+04 9.81210000e+04 9.96775000e+04 1.85538082e+08 9.89449767e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 22314 | 4 | 0.3% | |
| 32801 | 4 | 0.3% | |
| 20910 | 3 | 0.2% | |
| 90017 | 3 | 0.2% | |
| 32174 | 3 | 0.2% | |
| 45373 | 3 | 0.2% | |
| 89074 | 3 | 0.2% | |
| 60540 | 3 | 0.2% | |
| 30263 | 3 | 0.2% | |
| 32853 | 3 | 0.2% | |
| Other values (1409) | 1519 | 97.9% |
| Value | Count | Frequency (%) | |
| 603 | 1 | 0.1% | |
| 680 | 1 | 0.1% | |
| 791 | 1 | 0.1% | |
| 841 | 1 | 0.1% | |
| 920 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 989449767 | 1 | 0.1% | |
| 986420020 | 1 | 0.1% | |
| 985083041 | 1 | 0.1% | |
| 970311456 | 1 | 0.1% | |
| 959880984 | 1 | 0.1% |
cov_sta_dat
Categorical
| Distinct count | 180 |
|---|---|
| Unique (%) | 11.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| 1/1/2015 | 664 | 42.8% | |
| 1/1/2016 | 234 | 15.1% | |
| 7/1/2015 | 107 | 6.9% | |
| 4/1/2016 | 103 | 6.6% | |
| 10/1/2015 | 94 | 6.1% | |
| 4/1/2015 | 82 | 5.3% | |
| 7/1/2016 | 18 | 1.2% | |
| 12/1/2015 | 9 | 0.6% | |
| 9/1/2015 | 7 | 0.5% | |
| 6/1/2015 | 7 | 0.5% | |
| Other values (170) | 226 | 14.6% |
Length
| Max length | 10 |
|---|---|
| Mean length | 8.203739523 |
| Min length | 8 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Other_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
cov_end_dat
Categorical
| Distinct count | 126 |
|---|---|
| Unique (%) | 8.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| 10/19/2016 | 862 | 55.6% | |
| 9/30/2016 | 354 | 22.8% | |
| 6/30/2016 | 73 | 4.7% | |
| 3/31/2016 | 32 | 2.1% | |
| 12/31/2015 | 17 | 1.1% | |
| 10/15/2016 | 11 | 0.7% | |
| 9/30/2015 | 10 | 0.6% | |
| 11/28/2016 | 9 | 0.6% | |
| 8/10/2016 | 7 | 0.5% | |
| 7/15/2016 | 6 | 0.4% | |
| Other values (116) | 170 | 11.0% |
Length
| Max length | 10 |
|---|---|
| Mean length | 9.585428756 |
| Min length | 8 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 90.9% | |
| Other_Punctuation | 1 | 9.1% |
| Value | Count | Frequency (%) | |
| Common | 11 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 11 | 100.0% |
ind_con
Real number (ℝ≥0)
| Distinct count | 1521 |
|---|---|
| Unique (%) | 98.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 974037.4839 |
|---|---|
| Minimum | 5 |
| Maximum | 231831604.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 726 |
| Q1 | 15537.44 |
| median | 134465.05 |
| Q3 | 563404.17 |
| 95-th percentile | 2265976.015 |
| Maximum | 231831604.4 |
| Range | 231831599.4 |
| Interquartile range (IQR) | 547866.73 |
Descriptive statistics
| Standard deviation | 7354090.908 |
|---|---|
| Coefficient of variation (CV) | 7.550110781 |
| Kurtosis | 672.4994603 |
| Mean | 974037.4839 |
| Median Absolute Deviation (MAD) | 1315618.833 |
| Skewness | 23.73771695 |
| Sum | 1510732137 |
| Variance | 5.408265309e+13 |
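A skewness as large as the 23.74 reported above usually means a handful of extreme values dominate the right tail. The sketch below computes the adjusted Fisher-Pearson skewness in plain Python. It assumes, without confirmation from the report itself, that γ1 is the same bias-adjusted statistic that pandas computes with `Series.skew`; the function name and sample data are hypothetical.

```python
import math

def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed to match the report's γ1)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5
    # Small-sample bias adjustment.
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

symmetric = adjusted_skewness([1, 2, 3, 4, 5])       # balanced around the mean
right_tailed = adjusted_skewness([1, 2, 3, 4, 100])  # one large outlier

print(symmetric, right_tailed)
```

Symmetric data gives a skewness of zero, while a single large outlier pushes it strongly positive, which is the pattern behind the Skewed warnings.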
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[5.00000000e+00 2.45000000e+02 1.12173500e+03 7.64450000e+03 1.89335000e+04 ... 1.44887120e+06 2.23481619e+06 3.80798662e+06 1.43565620e+07 2.31831604e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 150 | 4 | 0.3% | |
| 200 | 4 | 0.3% | |
| 950 | 3 | 0.2% | |
| 700 | 3 | 0.2% | |
| 50 | 3 | 0.2% | |
| 100 | 3 | 0.2% | |
| 500 | 3 | 0.2% | |
| 3550 | 2 | 0.1% | |
| 215 | 2 | 0.1% | |
| 1400 | 2 | 0.1% | |
| Other values (1511) | 1522 | 98.1% |
| Value | Count | Frequency (%) | |
| 5 | 1 | 0.1% | |
| 10 | 1 | 0.1% | |
| 15 | 1 | 0.1% | |
| 20 | 2 | 0.1% | |
| 25 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 231831604.4 | 1 | 0.1% | |
| 105799882.7 | 1 | 0.1% | |
| 92036123.51 | 1 | 0.1% | |
| 63461402.63 | 1 | 0.1% | |
| 45362044.95 | 1 | 0.1% |
net_ope_exp
Real number (ℝ≥0)
| Distinct count | 1542 |
|---|---|
| Unique (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4384710.413 |
|---|---|
| Minimum | 1.8 |
| Maximum | 1954397343 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 1.8 |
|---|---|
| 5-th percentile | 3050.905 |
| Q1 | 25807.36 |
| median | 197032.64 |
| Q3 | 804576.485 |
| 95-th percentile | 3429628.83 |
| Maximum | 1954397343 |
| Range | 1954397342 |
| Interquartile range (IQR) | 778769.125 |
Descriptive statistics
| Standard deviation | 61660546.07 |
|---|---|
| Coefficient of variation (CV) | 14.06262678 |
| Kurtosis | 699.0083624 |
| Mean | 4384710.413 |
| Median Absolute Deviation (MAD) | 7453357.034 |
| Skewness | 24.63124139 |
| Sum | 6800685850 |
| Variance | 3.802022941e+15 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.80000000e+00 1.60000000e+01 1.08685550e+04 2.00826150e+04 3.17612550e+04 ... 2.45639220e+06 3.89883309e+06 1.44552169e+07 1.26103406e+08 1.95439734e+09], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 1915655.96 | 2 | 0.1% | |
| 6988 | 2 | 0.1% | |
| 77314.34 | 2 | 0.1% | |
| 8576.29 | 2 | 0.1% | |
| 5700 | 2 | 0.1% | |
| 83145.56 | 2 | 0.1% | |
| 212 | 2 | 0.1% | |
| 6000 | 2 | 0.1% | |
| 1150 | 2 | 0.1% | |
| 304961.94 | 1 | 0.1% | |
| Other values (1532) | 1532 | 98.8% |
| Value | Count | Frequency (%) | |
| 1.8 | 1 | 0.1% | |
| 4.59 | 1 | 0.1% | |
| 10 | 1 | 0.1% | |
| 11 | 1 | 0.1% | |
| 15 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 1954397343 | 1 | 0.1% | |
| 923650997.6 | 1 | 0.1% | |
| 724308424.2 | 1 | 0.1% | |
| 673695919.4 | 1 | 0.1% | |
| 394899772.8 | 1 | 0.1% |
tot_con
Real number (ℝ≥0)
| Distinct count | 1539 |
|---|---|
| Unique (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1262476.528 |
|---|---|
| Minimum | 10 |
| Maximum | 231837226.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 1946.245 |
| Q1 | 20956 |
| median | 168808.53 |
| Q3 | 1066731.79 |
| 95-th percentile | 3511399.245 |
| Maximum | 231837226.4 |
| Range | 231837216.4 |
| Interquartile range (IQR) | 1045775.79 |
Descriptive statistics
| Standard deviation | 7508667.972 |
|---|---|
| Coefficient of variation (CV) | 5.947570357 |
| Kurtosis | 624.5412706 |
| Mean | 1262476.528 |
| Median Absolute Deviation (MAD) | 1572373.082 |
| Skewness | 22.70877387 |
| Sum | 1958101094 |
| Variance | 5.638009471e+13 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.00000000e+01 6.97000000e+02 1.03680000e+04 3.20197650e+04 1.08367545e+05 ... 2.80897135e+06 4.75468138e+06 1.39049995e+07 2.02897909e+07 2.31837226e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 500 | 2 | 0.1% | |
| 750 | 2 | 0.1% | |
| 7300 | 2 | 0.1% | |
| 101148.61 | 2 | 0.1% | |
| 1851509.95 | 2 | 0.1% | |
| 3550 | 2 | 0.1% | |
| 10400 | 2 | 0.1% | |
| 600 | 2 | 0.1% | |
| 2700 | 2 | 0.1% | |
| 380 | 2 | 0.1% | |
| Other values (1529) | 1531 | 98.7% |
| Value | Count | Frequency (%) | |
| 10 | 1 | 0.1% | |
| 20 | 2 | 0.1% | |
| 30 | 1 | 0.1% | |
| 67 | 1 | 0.1% | |
| 84.83 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 231837226.4 | 1 | 0.1% | |
| 114492189.3 | 1 | 0.1% | |
| 92137218.65 | 1 | 0.1% | |
| 63466990.92 | 1 | 0.1% | |
| 45818117.38 | 1 | 0.1% |
tot_dis
Real number (ℝ≥0)
| Distinct count | 1542 |
|---|---|
| Unique (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1348706.665 |
|---|---|
| Minimum | 1.8 |
| Maximum | 238962741.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 1.8 |
|---|---|
| 5-th percentile | 4151.03 |
| Q1 | 29161.555 |
| median | 239804.03 |
| Q3 | 966803.295 |
| 95-th percentile | 3548315.6 |
| Maximum | 238962741.3 |
| Range | 238962739.5 |
| Interquartile range (IQR) | 937641.74 |
Descriptive statistics
| Standard deviation | 9244104.991 |
|---|---|
| Coefficient of variation (CV) | 6.854051536 |
| Kurtosis | 539.3898085 |
| Mean | 1348706.665 |
| Median Absolute Deviation (MAD) | 1717476.814 |
| Skewness | 22.03489511 |
| Sum | 2091844038 |
| Variance | 8.545347709e+13 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.80000000e+00 1.64535000e+04 3.16909700e+04 8.11371150e+04 1.45140505e+05 ... 2.45932472e+06 4.30110120e+06 1.35942085e+07 2.22781545e+07 2.38962741e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 8576.29 | 2 | 0.1% | |
| 1150 | 2 | 0.1% | |
| 1970657.96 | 2 | 0.1% | |
| 482422.91 | 2 | 0.1% | |
| 46196 | 2 | 0.1% | |
| 83369.66 | 2 | 0.1% | |
| 20000 | 2 | 0.1% | |
| 77314.34 | 2 | 0.1% | |
| 212 | 2 | 0.1% | |
| 111269.03 | 1 | 0.1% | |
| Other values (1532) | 1532 | 98.8% |
| Value | Count | Frequency (%) | |
| 1.8 | 1 | 0.1% | |
| 15 | 1 | 0.1% | |
| 17 | 1 | 0.1% | |
| 18.92 | 1 | 0.1% | |
| 29.5 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 238962741.3 | 1 | 0.1% | |
| 232031346.9 | 1 | 0.1% | |
| 93373187.27 | 1 | 0.1% | |
| 64258231.64 | 1 | 0.1% | |
| 51191082.94 | 1 | 0.1% |
net_con
Real number (ℝ≥0)
| Distinct count | 1537 |
|---|---|
| Unique (%) | 99.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4349405.511 |
|---|---|
| Minimum | 10 |
| Maximum | 2096279791 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 1780.5 |
| Q1 | 20956.645 |
| median | 167644.88 |
| Q3 | 1063289.95 |
| 95-th percentile | 3558825.865 |
| Maximum | 2096279791 |
| Range | 2096279781 |
| Interquartile range (IQR) | 1042333.305 |
Descriptive statistics
| Standard deviation | 62633106.12 |
|---|---|
| Coefficient of variation (CV) | 14.40038322 |
| Kurtosis | 836.3497738 |
| Mean | 4349405.511 |
| Median Absolute Deviation (MAD) | 7228835.44 |
| Skewness | 26.97269546 |
| Sum | 6745927948 |
| Variance | 3.922905983e+15 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.00000000e+01 2.77750000e+03 1.36545100e+04 3.23500000e+04 1.04157695e+05 ... 2.78668387e+06 4.64028406e+06 1.78534685e+07 1.32561365e+08 2.09627979e+09], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 380 | 2 | 0.1% | |
| 30 | 2 | 0.1% | |
| 450 | 2 | 0.1% | |
| 500 | 2 | 0.1% | |
| 2700 | 2 | 0.1% | |
| 101148.61 | 2 | 0.1% | |
| 1826909.95 | 2 | 0.1% | |
| 6500 | 2 | 0.1% | |
| 750 | 2 | 0.1% | |
| 1150 | 2 | 0.1% | |
| Other values (1527) | 1531 | 98.7% |
| Value | Count | Frequency (%) | |
| 10 | 1 | 0.1% | |
| 20 | 2 | 0.1% | |
| 30 | 2 | 0.1% | |
| 84.83 | 1 | 0.1% | |
| 85 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 2096279791 | 1 | 0.1% | |
| 827930839.9 | 1 | 0.1% | |
| 715555724.8 | 1 | 0.1% | |
| 468441873.4 | 1 | 0.1% | |
| 410046441 | 1 | 0.1% |
ope_exp
Real number (ℝ≥0)
| Distinct count | 1542 |
|---|---|
| Unique (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1250837.711 |
|---|---|
| Minimum | 1.8 |
| Maximum | 238374891.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 1.8 |
|---|---|
| 5-th percentile | 3344.665 |
| Q1 | 25716.385 |
| median | 197156.1 |
| Q3 | 796958.075 |
| 95-th percentile | 3294936.93 |
| Maximum | 238374891.3 |
| Range | 238374889.5 |
| Interquartile range (IQR) | 771241.69 |
Descriptive statistics
| Standard deviation | 9064417.319 |
|---|---|
| Coefficient of variation (CV) | 7.246677358 |
| Kurtosis | 555.206951 |
| Mean | 1250837.711 |
| Median Absolute Deviation (MAD) | 1638259.716 |
| Skewness | 22.40544929 |
| Sum | 1940049290 |
| Variance | 8.216366134e+13 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.80000000e+00 1.60000000e+01 7.48344000e+03 2.00983000e+04 4.51750350e+04 ... 1.97415859e+06 3.91712380e+06 1.35092901e+07 2.19796380e+07 2.38374891e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 6000 | 2 | 0.1% | |
| 1150 | 2 | 0.1% | |
| 83369.66 | 2 | 0.1% | |
| 1916057.96 | 2 | 0.1% | |
| 77314.34 | 2 | 0.1% | |
| 218 | 2 | 0.1% | |
| 6988 | 2 | 0.1% | |
| 212 | 2 | 0.1% | |
| 8576.29 | 2 | 0.1% | |
| 1448434.4 | 1 | 0.1% | |
| Other values (1532) | 1532 | 98.8% |
| Value | Count | Frequency (%) | |
| 1.8 | 1 | 0.1% | |
| 4.59 | 1 | 0.1% | |
| 11 | 1 | 0.1% | |
| 15 | 1 | 0.1% | |
| 17 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 238374891.3 | 1 | 0.1% | |
| 226685620.9 | 1 | 0.1% | |
| 87216086.77 | 1 | 0.1% | |
| 61598973.9 | 1 | 0.1% | |
| 46337371.79 | 1 | 0.1% |
tot_rec
Real number (ℝ≥0)
| Distinct count | 1540 |
|---|---|
| Unique (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1480109.543 |
|---|---|
| Minimum | 10 |
| Maximum | 254957194.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 4705.235 |
| Q1 | 32032.765 |
| median | 255237.7 |
| Q3 | 1154644.285 |
| 95-th percentile | 3880867.285 |
| Maximum | 254957194.9 |
| Range | 254957184.9 |
| Interquartile range (IQR) | 1122611.52 |
Descriptive statistics
| Standard deviation | 9610770.654 |
|---|---|
| Coefficient of variation (CV) | 6.493283354 |
| Kurtosis | 550.395497 |
| Mean | 1480109.543 |
| Median Absolute Deviation (MAD) | 1837178.231 |
| Skewness | 22.24268841 |
| Sum | 2295649900 |
| Variance | 9.236691257e+13 |
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.00000000e+01 1.86324700e+04 3.14512250e+04 7.94808350e+04 2.10186135e+05 ... 2.36960292e+06 3.90605949e+06 5.30635959e+06 2.09268335e+07 2.54957195e+08], "bayesian blocks" binning strategy used)
| Value | Count | Frequency (%) | |
| 7220 | 2 | 0.1% | |
| 1150 | 2 | 0.1% | |
| 7875.97 | 2 | 0.1% | |
| 24710 | 2 | 0.1% | |
| 600 | 2 | 0.1% | |
| 1882950.05 | 2 | 0.1% | |
| 139424.1 | 2 | 0.1% | |
| 7300 | 2 | 0.1% | |
| 3550 | 2 | 0.1% | |
| 20000 | 2 | 0.1% | |
| Other values (1530) | 1531 | 98.7% |
| Value | Count | Frequency (%) | |
| 10 | 1 | 0.1% | |
| 20 | 1 | 0.1% | |
| 30 | 1 | 0.1% | |
| 67.02 | 1 | 0.1% | |
| 110 | 1 | 0.1% |
| Value | Count | Frequency (%) | |
| 254957194.9 | 1 | 0.1% | |
| 236804528.5 | 1 | 0.1% | |
| 93469852.86 | 1 | 0.1% | |
| 65063783.43 | 1 | 0.1% | |
| 48126114.3 | 1 | 0.1% |
winner
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.2 KiB |
| Value | Count | Frequency (%) | |
| N | 1087 | 70.1% | |
| Y | 464 | 29.9% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
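The calculation described above can be sketched in a few lines of plain Python. This is an illustration only, not the code the report used; `pearson_r` and the sample data are hypothetical names and values.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / math.sqrt(ss_x * ss_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(x, y)                          # perfectly linear, increasing
r_shifted = pearson_r([10 * v + 3 for v in x], y)   # location/scale change of X
r_negative = pearson_r(x, [-v for v in y])          # perfectly linear, decreasing

print(r_linear, r_shifted, r_negative)
```

Rescaling and shifting X leaves r unchanged, which illustrates the invariance property mentioned above.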
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
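As a rough sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names below are hypothetical, and ties are given average ranks, which is a common convention but an assumption here.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but not linear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this monotonic but nonlinear relationship, ρ reaches its maximum while Pearson's r stays below 1.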
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
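The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called τ-a); statistical libraries commonly report a tie-adjusted variant (τ-b), so results may differ when the data contains ties. `kendall_tau` is a hypothetical name.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant - discordant) / total pairs, as described in the text (tau-a)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in X or Y; it counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

tau_mixed = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])     # one swapped pair
tau_same = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])  # identical ordering
tau_reversed = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])  # reversed ordering

print(tau_mixed, tau_same, tau_reversed)
```

In the first example five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.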
First rows
| | can_id | can_nam | can_off | can_off_sta | can_off_dis | can_par_aff | can_inc_cha_ope_sea | can_cit | can_sta | can_zip | cov_sta_dat | cov_end_dat | ind_con | net_ope_exp | tot_con | tot_dis | net_con | ope_exp | tot_rec | winner |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | H2GA12121 | ALLEN, RICHARD W | H | GA | 12.0 | REP | INCUMBENT | AUGUSTA | GA | 30904.0 | 1/1/2015 | 10/19/2016 | 601274.50 | 907156.21 | 1074949.50 | 978518.98 | 1074949.50 | 908518.98 | 1094022.76 | Y |
| 1 | H6PA02171 | EVANS, DWIGHT | H | PA | 2.0 | DEM | CHALLENGER | PHILADELPHIA | PA | 19138.0 | 11/2/2015 | 10/19/2016 | 1114711.02 | 1298831.83 | 1417545.22 | 1313583.69 | 1406719.06 | 1300557.53 | 1419270.92 | Y |
| 2 | H6FL04105 | RUTHERFORD, JOHN | H | FL | 4.0 | REP | OPEN | JACKSONVILLE | FL | 32224.0 | 4/1/2016 | 10/19/2016 | 542105.38 | 656210.29 | 650855.38 | 675642.76 | 650855.38 | 656642.76 | 711287.85 | Y |
| 3 | H4MT01041 | ZINKE, RYAN K | H | MT | 0.0 | REP | INCUMBENT | WHITEFISH | MT | 599373010.0 | 1/1/2015 | 10/19/2016 | 4317331.58 | 5055942.15 | 4980915.41 | 5200630.00 | 4938943.74 | 5073110.33 | 5190887.78 | Y |
| 4 | H8CA09060 | LEE, BARBARA | H | CA | 13.0 | DEM | INCUMBENT | OAKLAND | CA | 94612.0 | 1/1/2015 | 10/19/2016 | 897123.61 | 949488.98 | 1205863.61 | 1112163.94 | 1197676.61 | 953436.94 | 1209811.57 | Y |
| 5 | H6NC04037 | PRICE, DAVID E. | H | NC | 4.0 | DEM | INCUMBENT | RALEIGH | NC | 27602.0 | 1/1/2015 | 10/19/2016 | 328804.52 | 430826.04 | 728854.52 | 675837.98 | 725854.52 | 435688.13 | 733716.61 | Y |
| 6 | H2WI02124 | POCAN, MARK | H | WI | 2.0 | DEM | INCUMBENT | MADISON | WI | 53701.0 | 1/1/2015 | 10/19/2016 | 393873.83 | 445438.15 | 970547.37 | 745903.44 | 970385.04 | 445465.15 | 970574.37 | Y |
| 7 | H2MA09072 | LYNCH, STEPHEN | H | MA | 8.0 | DEM | INCUMBENT | SOUTH BOSTON | MA | 2127.0 | 1/1/2015 | 10/19/2016 | 767049.56 | 459790.68 | 1092269.56 | 493047.23 | 1092218.56 | 464636.23 | 1097115.11 | Y |
| 8 | H6OR02116 | WALDEN, GREGORY P MR. | H | OR | 2.0 | REP | INCUMBENT | HOOD RIVER | OR | 970311456.0 | 1/1/2015 | 10/19/2016 | 969437.03 | 1911215.54 | 3012350.64 | 2866919.77 | 3004650.64 | 1937694.04 | 3134128.86 | Y |
| 9 | H2MA04073 | KENNEDY, JOSEPH P III | H | MA | 4.0 | DEM | INCUMBENT | NEWTON | MA | 2459.0 | 1/1/2015 | 10/19/2016 | 1938192.38 | 1537844.39 | 2797967.38 | 1553016.80 | 2784362.26 | 1539411.68 | 3011173.93 | Y |
Last rows
| | can_id | can_nam | can_off | can_off_sta | can_off_dis | can_par_aff | can_inc_cha_ope_sea | can_cit | can_sta | can_zip | cov_sta_dat | cov_end_dat | ind_con | net_ope_exp | tot_con | tot_dis | net_con | ope_exp | tot_rec | winner |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1541 | H6MI10185 | FLYNN, MICHAEL | H | MI | 10.0 | REP | OPEN | SHELBY TOWNSHIP | MI | 48318.0 | 1/1/2015 | 9/30/2015 | 42250.00 | 53693.21 | 42250.00 | 60693.21 | 42250.00 | 53693.21 | 67252.81 | N |
| 1542 | H6IL18153 | MELLON, ROBERT | H | IL | 18.0 | DEM | OPEN | QUINCY | IL | 62305.0 | 4/1/2015 | 9/30/2015 | 21538.00 | 19007.00 | 21538.00 | 19007.00 | 21538.00 | 19007.00 | 21538.00 | N |
| 1543 | H6MS01180 | MILLS, MICHAEL P. JR | H | MS | 1.0 | REP | OPEN | FULTON | MS | 38843.0 | 1/1/2015 | 9/30/2015 | 98100.00 | 178100.00 | 101600.00 | 181600.00 | 101100.00 | 178100.00 | 181600.00 | N |
| 1544 | H6MS01172 | PIRKLE, GREGORY D. | H | MS | 1.0 | REP | OPEN | TUPELO | MS | 38802.0 | 1/1/2015 | 9/30/2015 | 213756.00 | 459834.76 | 213756.00 | 462392.04 | 212756.00 | 460070.80 | 462392.04 | N |
| 1545 | H6CA46124 | SANCHEZ, HEBERTO M | H | CA | 46.0 | DEM | OPEN | POMONA | CA | 91766.0 | 6/10/2015 | 9/30/2015 | 3318.96 | 3111.94 | 3318.96 | 3318.96 | 3111.94 | 3111.94 | 3318.96 | N |
| 1546 | P00004275 | BROWN, HARLEY D | P | US | 0.0 | NNE | OPEN | NAMPA | ID | 83686.0 | 1/1/2015 | 7/8/2015 | 215.00 | 9655.02 | 12847.39 | 11683.89 | 10487.10 | 11683.89 | 12847.39 | N |
| 1547 | H6NY11182 | LANE, JAMES | H | NY | 11.0 | GRE | OPEN | BROOKLYN | NY | 11215.0 | 1/1/2015 | 7/7/2015 | 12889.00 | 13356.89 | 14241.00 | 13983.11 | 14241.00 | 13356.89 | 14241.00 | N |
| 1548 | H6MS01164 | COLLINS, NANCY | H | MS | 1.0 | REP | OPEN | TUPELO | MS | 38804.0 | 1/1/2015 | 7/1/2015 | 95538.35 | 247121.35 | 102538.35 | 247121.35 | 102538.35 | 247121.35 | 247121.35 | N |
| 1549 | S6CA00618 | ALBERTSON, STEWART | S | CA | 0.0 | DEM | OPEN | REDWOOD CITY | CA | 94065.0 | 1/1/2015 | 6/30/2015 | 18949.00 | 15221.00 | 20949.00 | 30949.00 | 15250.00 | 15221.00 | 30949.00 | N |
| 1550 | H6MS01198 | JONES, ROGER STARNER DR. | H | MS | 1.0 | REP | OPEN | PONTOTOC | MS | 38863.0 | 1/1/2015 | 6/30/2015 | 25808.00 | 528638.39 | 140858.00 | 538358.00 | 140858.00 | 528638.39 | 538358.00 | N |